CS 180 Final Project
Project A: High Dynamic Range
Background
In the real world, radiance spans an enormous range: direct solar radiance exceeds 1000 W/m²/sr, while shaded areas lit only by ambient light sit around 1–2 W/m²/sr. A digital image, however, stores only 0–255 per channel, which makes representing real-world scenes a challenge. Even with the best cameras, it's very common for some areas of a photo to be overexposed.
To address the camera's inability to capture this full range, this project implements the high dynamic range (HDR) algorithm, which consists of radiance map construction and tone mapping.
Radiance Map Construction
The first step is to reconstruct the HDR radiance map, which converts raw pixel values into estimated radiance.
As shown in Figure 1 of Debevec and Malik (1997), the image acquisition pipeline has several stages. Scene radiance (L) passes through the camera lens and becomes sensor irradiance (E). After the shutter, we get the sensor exposure (X = E · t), where t is the exposure time. The CCD then converts the raw exposure into analog voltages and maps them to digital values; this last stage is non-linear.
Our goal is simple: given an image, we want to estimate the radiance at each pixel.
Given that (where i indexes the pixel and j indexes the exposure):
$$ X_i = E_i \cdot \Delta t_j $$
As in the paper, denote Z_i as the pixel value and f as the non-linear function that converts sensor exposure into a pixel value. We get:
$$ Z_i = f(X_i) = f(E_i \cdot \Delta t_j) $$
Let $g = \ln f^{-1}$ be the log inverse of f. Applying it to both sides, we get
$$ g(Z_i) = \ln(E_i) + \ln(\Delta t_j) $$
The domain of g is 0–255, since it maps each pixel value to a log exposure.
Smoothness Term:
To ensure that g is smooth, we add a penalty term to the objective function.
Here's the final formula that we directly take from the paper:
$$ \mathcal{O} = \sum_{i=1}^{N} \sum_{j=1}^{P} \left[ g(Z_{ij}) - \ln E_i - \ln \Delta t_j \right]^2 + \lambda \sum_{z=Z_{\text{min}}+1}^{Z_{\text{max}}-1} \left[ g''(z) \right]^2 $$
To estimate the second derivative, we can use the discrete version, so that the formula can become
$$ \mathcal{O} = \sum_{i=1}^{N} \sum_{j=1}^{P} \left[ g(Z_{ij}) - \ln E_i - \ln \Delta t_j \right]^2 + \lambda \sum_{z=Z_{\text{min}}+1}^{Z_{\text{max}}-1} \left[ g(z-1) - 2g(z) + g(z+1) \right]^2 $$
There are multiple ways to solve this optimization, such as gradient descent or a direct linear solver. In the paper, the authors use the SVD.
We set up the system
$$ Ax = b $$
and solve it in the least-squares sense.
Each term of the objective contributes one row to an overdetermined system; every row encodes a residual that we want to drive as close to 0 as possible. This means:
$$ g(z-1) - 2g(z) + g(z+1) = 0 $$ and $$ g(Z_{ij}) - \ln E_i - \ln \Delta t_j = 0 $$
For the second equation, we can rewrite it as:
$$ g(Z_{ij}) - \ln E_i = \ln \Delta t_j $$
We can represent x as the following:
$$ x = \begin{bmatrix} g(0) \\ g(1) \\ g(2) \\ \vdots \\ g(255) \\ \ln E_0 \\ \ln E_1 \\ \vdots \\ \ln E_{N-1} \end{bmatrix} $$
Then we fill in the rows of A and b to set up this equation. Notice that we put different weights on each equation: pixel values near the middle of the range are weighted more heavily, as they are likely to be more accurate than values near 0 or 255.
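A minimal sketch of how the system can be assembled and solved, assuming `Z` holds N sampled pixels across P exposures. We use `np.linalg.lstsq` rather than an explicit SVD call; the hat-shaped weighting and the extra `g(128) = 0` constraint (which fixes the arbitrary scale of the solution) follow the paper:

```python
import numpy as np

def gsolve(Z, log_dt, lam=50.0):
    """Recover the log response curve g and log irradiances (Debevec & Malik).

    Z       : (N, P) int array of pixel values, N sampled pixels x P exposures
    log_dt  : (P,) array of ln(exposure time) for each image
    lam     : smoothness weight lambda
    """
    n = 256
    N, P = Z.shape
    # Hat weighting: trust mid-range pixel values more than the extremes.
    w = np.minimum(np.arange(n), 255 - np.arange(n)).astype(float)

    A = np.zeros((N * P + 1 + (n - 2), n + N))
    b = np.zeros(A.shape[0])

    k = 0
    for i in range(N):                       # data-fitting rows
        for j in range(P):
            wij = w[Z[i, j]]
            A[k, Z[i, j]] = wij              # g(Z_ij)
            A[k, n + i] = -wij               # -ln(E_i)
            b[k] = wij * log_dt[j]           # ln(dt_j)
            k += 1

    A[k, 128] = 1.0                          # fix the scale: g(128) = 0
    k += 1

    for z in range(1, n - 1):                # smoothness rows: lambda * g''(z)
        A[k, z - 1] = lam * w[z]
        A[k, z] = -2 * lam * w[z]
        A[k, z + 1] = lam * w[z]
        k += 1

    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x[:n], x[n:]                      # g(0..255), ln(E_i) per sampled pixel
```

With g recovered, the full radiance map follows as a per-pixel weighted average of g(Z_ij) − ln Δt_j over the exposures, using the same hat weights.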
Here's our visualization result for the first test case (arch):


Tone Mapping
As we can see in the arch image above, the raw radiance map still does not look good on its own. We need to apply tone mapping operators, which significantly improve the result.
There are two kinds of tone mapping: global and local. For global tone mapping, we stretch the pixel values to fill [0, 255], through color scaling and the following operator:
$$ E_{display} = E_{world} / (1 + E_{world}) $$
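A sketch of the global operator, assuming `radiance` is the recovered linear radiance map. The gamma lift and the final min-max stretch are display tweaks of our own, not part of the operator itself:

```python
import numpy as np

def global_tone_map(radiance, gamma=0.5):
    """Global tone mapping: L / (1 + L), then stretch to [0, 255].

    radiance : float array of recovered (linear, non-negative) radiance values
    gamma    : optional display gamma (our own tweak, not from the operator)
    """
    # Compress the dynamic range with the simple global operator.
    display = radiance / (1.0 + radiance)
    # Gamma lift for display, then stretch to the 8-bit range.
    display = display ** gamma
    display = (display - display.min()) / (display.max() - display.min() + 1e-8)
    return (display * 255).astype(np.uint8)
```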
In addition, we also apply local tone mapping. As described in Chiu et al. (1993), we process the low and high frequencies separately: the low-frequency (base) layer undergoes dynamic range compression, while the detail layer is left untouched.
A simple Gaussian filter wouldn't work for separating the frequencies, since it causes halo artifacts: at edges, brightness from the bright side "leaks" into the dark region. Therefore, we introduce the bilateral filter.
The formula is as follows (credit: from CS 184's Image Processing lecture slide):
$$ \text{BL Filter}[I](x, y) = \sum_{i, j} f\left(\left\| I(x - i, y - j) - I(x, y) \right\|\right) G(i, j) I(x - i, y - j) $$
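The formula can be implemented directly, if slowly. A brute-force NumPy sketch, using Gaussians for both f (range kernel) and G (spatial kernel) and dividing by the total weight so that flat regions pass through unchanged; wrap-around border handling via `np.roll` is a simplification:

```python
import numpy as np

def bilateral_filter(I, sigma_s=2.0, sigma_r=0.5, radius=4):
    """Brute-force bilateral filter on a single-channel float image.

    sigma_s : spatial Gaussian width (kernel G)
    sigma_r : range Gaussian width (kernel f), in intensity units
    """
    out = np.zeros_like(I)
    weight = np.zeros_like(I)
    for di in range(-radius, radius + 1):
        for dj in range(-radius, radius + 1):
            G = np.exp(-(di**2 + dj**2) / (2 * sigma_s**2))       # spatial kernel
            shifted = np.roll(I, (di, dj), axis=(0, 1))           # I(x - i, y - j)
            f = np.exp(-((shifted - I) ** 2) / (2 * sigma_r**2))  # range kernel
            out += f * G * shifted
            weight += f * G
    return out / weight        # normalize by the total kernel weight
```

For the local pass, the base layer is the bilateral filter of the log intensity, the detail layer is the residual, and only the base layer is compressed before recombining.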
Here's the result from all testing images (Durand is after applying the tone mapping):








Project B: Image Quilting
Background
In this project, we're going to implement texture synthesis from a small sample. We will also show some examples of texture transfer using a similar algorithm. Our goal is to blend the patches seamlessly and generate a coherent result.
Randomly Sampled Texture
The initial idea is to randomly sample a small square patch from the original texture and place it into the output sample image. However, it is very possible that the edges between adjacent patches will look unnatural since there is no constraint on the transition.
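A minimal sketch of this random-tiling baseline (the function name and signature are our own):

```python
import numpy as np

def quilt_random(sample, out_size, patch_size, seed=0):
    """Tile a square output with patches sampled uniformly at random."""
    rng = np.random.default_rng(seed)
    H, W = sample.shape[:2]
    out = np.zeros((out_size, out_size) + sample.shape[2:], dtype=sample.dtype)
    for y in range(0, out_size, patch_size):
        for x in range(0, out_size, patch_size):
            # Pick a random top-left corner inside the source texture.
            sy = rng.integers(0, H - patch_size + 1)
            sx = rng.integers(0, W - patch_size + 1)
            patch = sample[sy:sy + patch_size, sx:sx + patch_size]
            # Crop the patch at the right/bottom borders of the output.
            h = min(patch_size, out_size - y)
            w = min(patch_size, out_size - x)
            out[y:y + h, x:x + w] = patch[:h, :w]
    return out
```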
Here's a sample result:

As we can see in the above sample, the brick edges are mismatched between adjacent patches.
Overlapping Patches
One simple improvement is to allow the patch to overlap with its neighbors (specifically the top and left patches) when generating the samples. The goal is to select a new patch from the texture that is most similar to the existing template. While an iterative approach using a for-loop can work, it will be significantly slower. The formula is as follows:
$$ SSD(T, I, M) = \sum_{i, j} M(i, j) \cdot (T(i, j) - I(i, j))^2 $$
The optimized version expands this formula into terms that can be computed with vectorized NumPy cross-correlations, which is much faster.
Once we calculate the cost image (the SSD value for placing the patch at each position), we sort the costs and randomly choose among the k lowest. A larger k yields a more stochastic texture, but may also introduce more errors.
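A sketch of the expanded SSD and the top-k selection, assuming SciPy is available. `ssd_cost` expands M·(T − I)² into three terms, two of which are cross-correlations over the image:

```python
import numpy as np
from scipy.signal import correlate2d

def ssd_cost(template, image, mask):
    """Masked SSD of `template` against every valid patch of `image`.

    Expands sum M*(T - I)^2 = sum M*T^2 - 2*sum M*T*I + sum M*I^2,
    where the last two terms are cross-correlations over patch positions.
    """
    const = np.sum(mask * template**2)                          # sum M*T^2
    cross = correlate2d(image, mask * template, mode='valid')   # sum M*T*I
    quad = correlate2d(image**2, mask, mode='valid')            # sum M*I^2
    return const - 2 * cross + quad

def choose_patch(cost, k, rng):
    """Pick one of the k lowest-cost patch positions uniformly at random."""
    flat = np.argsort(cost, axis=None)[:k]
    idx = rng.choice(flat)
    return np.unravel_index(idx, cost.shape)
```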
Here's a sample result:
As we can see from the above comparison, the bricks are placed in a more organized manner, and the edges are aligned.
Seam Finding
There are still some edge artifacts from the overlapping patches. We can remove these by finding the min-cut of the bndcost. The bndcost is simply the squared differences between the output image and the new patch, masked by the overlap region. The idea is to find the shortest path or cost for the valid seam. We directly apply the cut function provided in the utilities.
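The seam search itself is a standard dynamic program. A sketch for a vertical seam (our own implementation, as a stand-in for the provided `cut` utility):

```python
import numpy as np

def min_cut_seam(bndcost):
    """Minimum-cost vertical seam through `bndcost` (H x W float array).

    Returns the seam column for each row; pixels on one side of the seam come
    from the existing output, pixels on the other side from the new patch.
    """
    H, W = bndcost.shape
    cost = bndcost.astype(float).copy()
    # Forward pass: accumulate the cheapest path reaching each pixel.
    for y in range(1, H):
        left = np.r_[np.inf, cost[y - 1, :-1]]
        right = np.r_[cost[y - 1, 1:], np.inf]
        cost[y] += np.minimum(cost[y - 1], np.minimum(left, right))
    # Backward pass: trace the seam from the cheapest bottom pixel.
    seam = np.zeros(H, dtype=int)
    seam[-1] = int(np.argmin(cost[-1]))
    for y in range(H - 2, -1, -1):
        j = seam[y + 1]
        lo, hi = max(0, j - 1), min(W, j + 2)
        seam[y] = lo + int(np.argmin(cost[y, lo:hi]))
    return seam
```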
Here's the result (parameters: patch size: 75, overlap: 43, tot: 50):

As we can see in the comparison above, edge artifacts are greatly reduced, and the result generally looks much more natural.
Here are some additional textures:
White:

Scratch:

Texture Transfer
In texture transfer, the implementation is similar to quilt_cut, except that we add an additional cost function to match the sample patch with the target image's corresponding map. Specifically, we need to calculate the luminance difference between the patch and the target. Additionally, we need to include the original cost map to ensure the transition between patches is as smooth as possible.
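A sketch of how the two costs can be blended when scoring a candidate patch; the `alpha` trade-off parameter and the function signature are our own illustration:

```python
import numpy as np

def transfer_cost(overlap_cost, patch_lum, target_lum, alpha=0.5):
    """Blend the quilting overlap cost with a luminance-correspondence cost.

    overlap_cost : scalar SSD over the overlap region (from quilt_cut)
    patch_lum    : luminance of the candidate source patch
    target_lum   : luminance of the target image at the current location
    alpha        : weight on texture coherence vs. fidelity to the target
    """
    corr_cost = np.sum((patch_lum - target_lum) ** 2)
    return alpha * overlap_cost + (1 - alpha) * corr_cost
```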
Here's the result (transfer Feynman's photo into sketch's texture):

Bells & Whistles
Iterative Texture Transfer Method
As mentioned by Prof. Efros in the paper, we should iteratively generate the synthesized texture transfer. The process starts with a large patch size and reduces it by one-third in each iteration. The output image from the previous iteration is used as guidance for generating the next image, so we need to include this cost in the optimization problem.
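A sketch of the outer loop, with `quilt_transfer` standing in for the single-pass synthesis routine (hypothetical signature). The alpha schedule is the one suggested in the paper, shifting weight from texture coherence toward target fidelity across iterations:

```python
def iterative_transfer(sample, target, quilt_transfer, n_iters=3, patch0=36):
    """Iterative texture transfer: shrink the patch size each pass and use the
    previous output as guidance for the next.

    quilt_transfer(sample, target, prev, patch_size, alpha) is assumed to be
    the single-pass routine; `prev` is None on the first iteration.
    """
    prev = None
    patch = patch0
    for i in range(1, n_iters + 1):
        # Paper's schedule: alpha_i = 0.8 * (i - 1) / (N - 1) + 0.1.
        alpha = 0.8 * (i - 1) / max(n_iters - 1, 1) + 0.1
        prev = quilt_transfer(sample, target, prev, patch, alpha)
        patch = max(3, (2 * patch) // 3)   # reduce patch size by one third
    return prev
```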
Here's some comparisons:
Texture: scratch

Texture: bricks

While the result for the bricks texture looks similar (partly because the transitions are relatively smooth for those brick patches), the iterative approach produces significantly better results in terms of details. For example, if we look at Feynman's eyes, the non-iterative approach shows a clear boundary between patches because it tries too strictly to align with the target image's illuminance. By contrast, the iterative approach generally produces better and more accurate details.